I have been working on converting the Himmelstein (2022) advice-taking dual-hurdle model from a two-level cross-classified model (responses cross-classified within judges and items) into a three-level cross-classified model (responses cross-classified within judges and items, nested within studies). To do this, I modified the Stan code and its R implementation to add a third level that accounts for study-level variance in the intercepts, both in the part of the model that estimates the probabilities of Decline, Adopt, and Compromise (DAC) and in the part that estimates the continuous Weight of Advice (WOA) values. I then fit the model to the full dataset of WOA studies.
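To make the hurdle structure concrete: each response is first classified into one of the three DAC categories, and only Compromise responses get a continuous WOA value. A toy sketch of that mapping (using the same coding as the model code below, with hypothetical WOA values):

```r
# Toy illustration (not the fitted model): mapping raw WOA values to the
# three DAC hurdle categories (Compromise = 1, Decline = 2, Adopt = 3).
# WOA = 0 means the advice was declined; WOA = 1 means it was adopted;
# anything in between is a compromise, modeled with a beta distribution.
woa <- c(0, 0.3, 0.5, 1, 0.7, 0)  # hypothetical raw WOA values
dac <- ifelse(woa == 0, 2, ifelse(woa == 1, 3, 1))
dac  # 2 1 1 3 1 2
```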
Code
Stan Model Code
```stan
data {
  int<lower=1> N;               // number of observations
  int<lower=2> ncat;            // number of DAC categories
  int Y1[N];                    // DAC category (Compromise = 1, Decline = 2, Adopt = 3)
  int<lower=1> K;               // number of population-level effects post hurdle
  matrix[N, K] X;               // population-level design matrix post hurdle
  // data for group-level effects of ID 1 = person
  int<lower=1> N_1;             // number of grouping levels
  int<lower=1> M_1;             // number of coefficients per level (random effects)
  int<lower=1> J_1[N];          // grouping indicator per observation
  vector[N] Z_1_1;              // group 1 (person)-level predictor values
  // data for group-level effects of ID 2 = item
  int<lower=1> N_2;
  int<lower=1> M_2;
  int<lower=1> J_2[N];
  vector[N] Z_2_1;              // group 2 (item)-level predictor values
  // data for group-level effects of ID 3 = study
  int<lower=1> N_3;
  int<lower=1> M_3;
  int<lower=1> J_3[N];
  vector[N] Z_3_1;              // group 3 (study)-level predictor values
  // data for outcomes
  vector[N] Y2;                 // trimmed response variable
  vector[N] Y2_complete;        // untrimmed response variable
  // priors
  vector[K - 1] b_prior;
  int prior_only;               // should the likelihood be ignored?
  // test data
  int<lower=1> N_test;          // number of observations in test data
  matrix[N_test, K - 1] X_test;
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;             // centered version of X without an intercept
  vector[Kc] means_X;           // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b_eta1;            // population-level effects (Decline, category 2)
  real Intercept_eta1;          // temporary intercept for centered predictors
  vector[Kc] b_eta2;            // population-level effects (Adopt, category 3)
  real Intercept_eta2;          // temporary intercept for centered predictors
  vector<lower=0>[M_1] sd_1;    // person-level SDs (eta1)
  vector[N_1] z_1[M_1];         // standardized group-level effects
  vector<lower=0>[M_2] sd_2;    // item-level SDs (eta1)
  vector[N_2] z_2[M_2];
  vector<lower=0>[M_1] sd_3;    // person-level SDs (eta2)
  vector[N_1] z_3[M_1];
  vector<lower=0>[M_2] sd_4;    // item-level SDs (eta2)
  vector[N_2] z_4[M_2];
  vector<lower=0>[M_3] sd_5;    // study-level SDs (eta1)
  vector[N_3] z_5[M_3];
  vector<lower=0>[M_3] sd_6;    // study-level SDs (eta2)
  vector[N_3] z_6[M_3];
  vector[Kc] b;                 // population-level effects (beta mean, mu)
  real Intercept;               // temporary intercept for centered predictors
  vector<lower=0>[M_1] sd_1a;   // person-level SDs (mu)
  vector[N_1] z_1a[M_1];
  vector<lower=0>[M_2] sd_2a;   // item-level SDs (mu)
  vector[N_2] z_2a[M_2];
  vector<lower=0>[M_1] sd_1b;   // person-level SDs (phi)
  vector[N_1] z_1b[M_1];
  vector<lower=0>[M_2] sd_2b;   // item-level SDs (phi)
  vector[N_2] z_2b[M_2];
  vector<lower=0>[M_3] sd_3a;   // study-level SDs (mu)
  vector[N_3] z_3a[M_3];
  vector<lower=0>[M_3] sd_3b;   // study-level SDs (phi)
  vector[N_3] z_3b[M_3];
  real Intercept_phi;           // temporary intercept for centered distance var
  vector[Kc] b_dist;            // population-level effects (beta precision, phi)
}
transformed parameters {
  vector[N_1] r_1_eta1_1;       // actual group-level effects
  vector[N_2] r_2_eta1_1;
  vector[N_1] r_3_eta2_1;
  vector[N_2] r_4_eta2_1;
  vector[N_3] r_5_eta1_1;
  vector[N_3] r_6_eta2_1;
  vector[N_1] r_1_1;
  vector[N_2] r_2_1;
  vector[N_1] r_1_1b;
  vector[N_2] r_2_1b;
  vector[N_3] r_3_1;
  vector[N_3] r_3_1b;
  r_1_eta1_1 = sd_1[1] * z_1[1];
  r_2_eta1_1 = sd_2[1] * z_2[1];
  r_3_eta2_1 = sd_3[1] * z_3[1];
  r_4_eta2_1 = sd_4[1] * z_4[1];
  r_5_eta1_1 = sd_5[1] * z_5[1];
  r_6_eta2_1 = sd_6[1] * z_6[1];
  r_1_1  = sd_1a[1] * z_1a[1];
  r_2_1  = sd_2a[1] * z_2a[1];
  r_1_1b = sd_1b[1] * z_1b[1];
  r_2_1b = sd_2b[1] * z_2b[1];
  r_3_1  = sd_3a[1] * z_3a[1];
  r_3_1b = sd_3b[1] * z_3b[1];
}
model {
  // linear predictors for the hurdle (DAC) part
  vector[N] eta1 = Intercept_eta1 + Xc * b_eta1;
  vector[N] eta2 = Intercept_eta2 + Xc * b_eta2;
  vector[ncat] eta[N];          // linear predictor matrix
  // linear predictors for the beta part post hurdle
  vector[N] mu = Intercept + Xc * b;
  vector[N] phi = Intercept_phi + Xc * b_dist;
  vector[N] A;                  // first shape parameter of the beta distribution
  vector[N] B;                  // second shape parameter of the beta distribution
  for (n in 1:N) {
    // add group-level terms to the linear predictors
    eta1[n] += r_1_eta1_1[J_1[n]] * Z_1_1[n] + r_2_eta1_1[J_2[n]] * Z_2_1[n]
               + r_5_eta1_1[J_3[n]] * Z_3_1[n];
    eta2[n] += r_3_eta2_1[J_1[n]] * Z_1_1[n] + r_4_eta2_1[J_2[n]] * Z_2_1[n]
               + r_6_eta2_1[J_3[n]] * Z_3_1[n];
    eta[n] = [0, eta1[n], eta2[n]]';
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_2_1[J_2[n]] * Z_2_1[n]
             + r_3_1[J_3[n]] * Z_3_1[n];
    phi[n] += r_1_1b[J_1[n]] * Z_1_1[n] + r_2_1b[J_2[n]] * Z_2_1[n]
              + r_3_1b[J_3[n]] * Z_3_1[n];
    A[n] = inv_logit(mu[n]) * exp(phi[n]);
    B[n] = (1 - inv_logit(mu[n])) * exp(phi[n]);
  }
  // priors (including all constants) for the hurdle part
  target += normal_lpdf(b_eta1 | 0, b_prior);
  target += normal_lpdf(b_eta2 | 0, b_prior);
  target += normal_lpdf(Intercept_eta1 | 0, 5);
  target += normal_lpdf(Intercept_eta2 | 0, 5);
  target += student_t_lpdf(sd_1 | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_1[1]);
  target += student_t_lpdf(sd_2 | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_2[1]);
  target += student_t_lpdf(sd_3 | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_3[1]);
  target += student_t_lpdf(sd_4 | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_4[1]);
  target += student_t_lpdf(sd_5 | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_5[1]);
  target += student_t_lpdf(sd_6 | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_6[1]);
  // priors (including all constants) for the beta part
  target += normal_lpdf(b | 0, b_prior);
  target += normal_lpdf(Intercept | 0, 5);
  target += normal_lpdf(b_dist | 0, b_prior);
  target += normal_lpdf(Intercept_phi | 0, 5);
  target += student_t_lpdf(sd_1a | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_1a[1]);
  target += student_t_lpdf(sd_2a | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_2a[1]);
  target += student_t_lpdf(sd_1b | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_1b[1]);
  target += student_t_lpdf(sd_2b | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_2b[1]);
  target += student_t_lpdf(sd_3a | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_3a[1]);
  target += student_t_lpdf(sd_3b | 3, 0, 2.5) - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_3b[1]);
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      if (Y1[n] == 2) {
        2 ~ categorical_logit(eta[n]);
      } else if (Y1[n] == 3) {
        3 ~ categorical_logit(eta[n]);
      } else {
        1 ~ categorical_logit(eta[n]);
        Y2[n] ~ beta(A[n], B[n]);
      }
    }
  }
}
generated quantities {
  // actual population-level intercepts
  real b_eta1_Intercept = Intercept_eta1 - dot_product(means_X, b_eta1);
  real b_eta2_Intercept = Intercept_eta2 - dot_product(means_X, b_eta2);
  real b_Intercept = Intercept - dot_product(means_X, b);
  real b_Intercept_phi = Intercept_phi - dot_product(means_X, b_dist);
  real eta1;
  real eta2;
  vector[N_test] eta1_t = b_eta1_Intercept + X_test * b_eta1;
  vector[N_test] eta2_t = b_eta2_Intercept + X_test * b_eta2;
  // test-condition category probabilities
  real pc1_t[N_test];
  real pc2_t[N_test];
  real pc3_t[N_test];
  vector[ncat] eta;
  vector[N] log_lik;            // pointwise log likelihood
  real y_rep[N];                // posterior predictive draws
  int y1_rep;
  real mu;
  real phi;
  vector[N_test] mu_t = b_Intercept + X_test * b;
  vector[N_test] phi_t = exp(b_Intercept_phi + X_test * b_dist);
  real A_t[N_test];
  real B_t[N_test];
  for (n in 1:N) {
    // rebuild the linear predictors with group-level terms
    eta1 = Intercept_eta1 + Xc[n] * b_eta1 + r_1_eta1_1[J_1[n]] * Z_1_1[n]
           + r_2_eta1_1[J_2[n]] * Z_2_1[n] + r_5_eta1_1[J_3[n]] * Z_3_1[n];
    eta2 = Intercept_eta2 + Xc[n] * b_eta2 + r_3_eta2_1[J_1[n]] * Z_1_1[n]
           + r_4_eta2_1[J_2[n]] * Z_2_1[n] + r_6_eta2_1[J_3[n]] * Z_3_1[n];
    eta = [0, eta1, eta2]';
    mu = Intercept + Xc[n] * b + r_1_1[J_1[n]] * Z_1_1[n]
         + r_2_1[J_2[n]] * Z_2_1[n] + r_3_1[J_3[n]] * Z_3_1[n];
    phi = Intercept_phi + Xc[n] * b_dist + r_1_1b[J_1[n]] * Z_1_1[n]
          + r_2_1b[J_2[n]] * Z_2_1[n] + r_3_1b[J_3[n]] * Z_3_1[n];
    // posterior predictive draw for observation n
    y1_rep = categorical_logit_rng(eta);
    if (y1_rep == 2) {
      y_rep[n] = 0;
    } else if (y1_rep == 3) {
      y_rep[n] = 1;
    } else {
      y_rep[n] = beta_rng(inv_logit(mu) * exp(phi),
                          (1 - inv_logit(mu)) * exp(phi));
    }
    // pointwise log likelihood for observation n
    if (Y1[n] == 2) {
      log_lik[n] = categorical_logit_lpmf(2 | eta);
    } else if (Y1[n] == 3) {
      log_lik[n] = categorical_logit_lpmf(3 | eta);
    } else {
      log_lik[n] = categorical_logit_lpmf(1 | eta);
      log_lik[n] += beta_lpdf(Y2[n] | inv_logit(mu) * exp(phi),
                              (1 - inv_logit(mu)) * exp(phi));
    }
  }
  for (n2 in 1:N_test) {
    pc2_t[n2] = exp(eta1_t[n2]) / (1 + exp(eta1_t[n2]) + exp(eta2_t[n2]));
    pc3_t[n2] = exp(eta2_t[n2]) / (1 + exp(eta1_t[n2]) + exp(eta2_t[n2]));
    pc1_t[n2] = 1 - pc2_t[n2] - pc3_t[n2];
    A_t[n2] = inv_logit(mu_t[n2]) * phi_t[n2];
    B_t[n2] = (1 - inv_logit(mu_t[n2])) * phi_t[n2];
  }
}
```
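The category probabilities computed in the generated quantities block are just a softmax with Compromise (category 1) pinned to a linear predictor of 0 as the reference. A minimal R sketch of that calculation, with hypothetical linear-predictor values:

```r
# Sketch of the softmax used for the DAC probabilities in generated quantities.
# eta1 and eta2 here are hypothetical linear predictors for Decline (category 2)
# and Adopt (category 3); Compromise (category 1) is the reference with eta = 0.
eta1 <- 0.4
eta2 <- -0.2
denom <- 1 + exp(eta1) + exp(eta2)  # 1 = exp(0) for the reference category
pc2 <- exp(eta1) / denom  # P(Decline)
pc3 <- exp(eta2) / denom  # P(Adopt)
pc1 <- 1 - pc2 - pc3      # P(Compromise)
c(pc1, pc2, pc3)          # the three probabilities sum to 1
```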
R Code Model Setup
```r
here::here()
library(rstan)
library(bridgesampling)
library(loo)
library(bayesplot)
library(bayestestR)
library(tidyverse)

options(scipen = 999)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
set.seed(123)

dat_full <- read.csv(here::here("woa_datasets.csv"))

studynames <- dat_full %>%
  select(!c(gender, published, base, female_percentage, mean_age, student_judge,
            almanac, future, incentive, abs_value, winsor)) %>%
  mutate(distance = abs(firstestimate - advice) / firstestimate) %>%
  filter(!is.na(woa_raw) & !is.na(zconfidence) & distance < 1) %>%
  select(study) %>%
  summarize(.by = study,
            studyname = first(study),
            study = as.character(cur_group_id()))

dat <- dat_full %>%
  select(!c(gender, published, base, female_percentage, mean_age, student_judge,
            almanac, future, incentive, abs_value, winsor)) %>%
  mutate(distance = abs(firstestimate - advice) / firstestimate) %>%
  filter(!is.na(woa_raw) & !is.na(zconfidence) & distance < 1) %>%
  mutate(.by = study, study = cur_group_id()) %>%
  mutate(.by = id, id = cur_group_id()) %>%
  mutate(.by = c(study, trial), trial = cur_group_id())

dat %>% summary()

## Create trimmed version of WOA so 0s and 1s stay inside (0, 1)
dat$woa_winsor_trim <- dat$woa_winsor
dat$woa_winsor_trim[dat$woa_winsor == 1] <- .999
dat$woa_winsor_trim[dat$woa_winsor == 0] <- .001

## Subset predictors and create interaction terms
x <- dat %>%
  mutate(intercept = 1,
         distance = distance,
         confidence = zconfidence,
         distance_confidence = distance * confidence) %>%
  select(intercept, distance, confidence, distance_confidence)
  # Advisor predictors currently excluded:
  # randomadvice = random_advice,
  # expertadvisor = expert_advisor,
  # random_expert = randomadvice * expertadvisor

## Create test/toy data
testdat <- expand.grid(
  distance = c(mean(x$distance) - sd(x$distance),
               mean(x$distance),
               mean(x$distance) + sd(x$distance)),
  confidence = c(mean(x$confidence) - sd(x$confidence),
                 mean(x$confidence),
                 mean(x$confidence) + sd(x$confidence))) %>%
  mutate(distance_confidence = distance * confidence)
  # randomadvice = c(0, 1),
  # expertadvisor = c(0, 1),
  # random_expert = randomadvice * expertadvisor

condition_names <- expand.grid(distance = c("DL", "DM", "DH"),
                               confidence = c("CL", "CM", "CH")) %>%
  mutate(condition = paste0(distance, confidence)) %>%
  pull(condition)
  # randomadvice = c("NR", "YR"),
  # expertadvisor = c("NE", "YE")

## Scale priors (the intercept prior is set separately in the Stan code)
pr_v <- rep(NA, ncol(x) - 1)
for (i in 1:length(pr_v)) {
  sdi <- sd(x[, i + 1], na.rm = TRUE)
  if (length(levels(factor(x[, i + 1]))) == 2) {  # binary predictor
    pr_v[i] <- 3
  } else {
    pr_v[i] <- 3 / sdi  # scale the prior SD by the predictor's SD
  }
}

## Create DAC variable (Compromise = 1, Decline = 2, Adopt = 3)
dat$DAC <- ifelse(dat$woa_winsor == 0, 2, ifelse(dat$woa_winsor == 1, 3, 1))
k <- ncol(x)

## Run model
helmer_stan <- stan(
  "DAC Helmer 2.stan",
  data = list(N = nrow(dat),                       # number of observations
              ncat = 3,                            # number of DAC categories
              Y1 = dat$DAC,                        # DAC variable
              K = k,                               # number of predictors + 1 for intercept
              X = x,                               # predictor matrix
              N_1 = length(unique(dat$id)),        # number of unique participants
              M_1 = 1,                             # random effects per person (intercept only)
              J_1 = dat$id,                        # participant ids
              N_2 = length(unique(dat$trial)),     # number of unique items
              M_2 = 1,                             # random effects per item
              J_2 = dat$trial,                     # item ids
              N_3 = length(unique(dat$study)),     # number of unique studies
              M_3 = 1,                             # random effects per study
              J_3 = dat$study,                     # study ids
              Y2 = dat$woa_winsor_trim,            # continuous response for the beta regression
              Y2_complete = dat$woa_winsor,        # untrimmed version
              Z_1_1 = rep(1, nrow(dat)),           # person random-effect predictor (1s for intercept)
              Z_2_1 = rep(1, nrow(dat)),           # item random-effect predictor (1s for intercept)
              Z_3_1 = rep(1, nrow(dat)),           # study random-effect predictor (1s for intercept)
              b_prior = pr_v,                      # scaled priors for coefficients
              prior_only = 0,                      # draw from prior only and ignore likelihood?
              N_test = nrow(testdat),              # number of observations in test data
              X_test = testdat),                   # test data
  warmup = 1000, iter = 2500,
  seed = 50401, init_r = .2)

saveRDS(helmer_stan, file = here::here("helmer_stan_2_4.rds"))
```
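For Compromise responses, the beta likelihood above is parameterized by mean and precision rather than the usual two shape parameters: the mean is `inv_logit(mu)` and the precision is `exp(phi)`, which the Stan code converts into shapes `A` and `B`. A small R sketch of that conversion, with hypothetical linear-predictor values:

```r
# Sketch of the beta mean/precision reparameterization used in the model:
# mean = inv_logit(mu), precision = exp(phi),
# A = mean * precision, B = (1 - mean) * precision.
inv_logit <- function(x) 1 / (1 + exp(-x))
mu <- 0.8   # hypothetical linear predictor for the mean
phi <- 1.5  # hypothetical linear predictor for the (log) precision
A <- inv_logit(mu) * exp(phi)
B <- (1 - inv_logit(mu)) * exp(phi)
A / (A + B)  # the beta mean, which recovers inv_logit(mu)
```

This is why `phi_t` in the generated quantities is exponentiated once up front: `A_t` and `B_t` then multiply by `phi_t` directly instead of `exp(phi_t)`.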
Displayed below is a plot of the distribution of WOA values from the data and from the model's posterior predictions. We can see a large proportion of WOA values near zero and one, with a range of values in between, as the DAC framework would predict.
One noticeable feature is the spike at exactly .5, with lower density surrounding it. This appears to be a study-level phenomenon: some studies have many responses with WOA values of exactly .5. Below are the studies with the highest proportions of .5 responses.
The average proportion of .5 responses in any given study was 5.24%.
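A per-study proportion like this can be computed directly from the WOA values. A hypothetical base-R sketch with toy data (the column names mirror the dataset used above):

```r
# Toy sketch: proportion of exact-.5 WOA responses per study.
# toy stands in for dat; in the real data this would be
# tapply(dat$woa_winsor == 0.5, dat$study, mean).
toy <- data.frame(study = c(1, 1, 1, 2, 2),
                  woa_winsor = c(0.5, 0.2, 0.5, 0.5, 1))
prop_half <- tapply(toy$woa_winsor == 0.5, toy$study, mean)
prop_half  # proportion of .5 responses for each study
```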
It may be helpful to identify why certain studies have more responses of exactly .5 than others; the model seems to fit the data well across all values except the center.
Below we can see the distribution of the study-level intercepts for the probabilities of responses falling into the DAC categories. To the right, box plots show the distribution of the sampled intercepts for each study. We can see decent variability in these distributions between studies.
The next step is to decide how to handle these high-.5 studies and to add predictors to the model. I'm looking forward to hearing which predictors would be best to include in this analysis. If there are other ways of analyzing or presenting these results that would be helpful, please let me know!